Biophysical Journal: Biophysical Letters 



Chain Length Determines the Folding Rates of RNA 



Changbong Hyeon* and D. Thirumalai†

*School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea and †Biophysics Program, Institute for Physical
Science and Technology, University of Maryland, College Park, MD 20742






ABSTRACT We show that the folding rates ($k_F$s) of RNA are determined by $N$, the number of nucleotides. By assuming that
the distribution of free energy barriers separating the folded and the unfolded states is Gaussian, which follows from central
limit theorem arguments and polymer physics concepts, we show that $k_F \approx k_0 \exp(-\alpha N^{0.5})$. Remarkably, the theory fits the
experimental rates spanning over seven orders of magnitude with $k_0 \sim 1.0\ (\mu\mathrm{s})^{-1}$. An immediate consequence of our finding
is that the speed limit of RNA folding is about one microsecond just as it is in the folding of globular proteins. 

RNA molecules are evolved biopolymers whose folding
has attracted a great deal of attention [1, 2, 3] because of 
the crucial role they play in a number of cellular functions. 
The slightly branched polymeric nature of RNA implies 
that the shapes, relaxation dynamics, and even their folding 
rates must depend on N. In support of this assertion it has 
been shown that the radius of gyration of the folded states, 
using data available in the Protein Data Bank (PDB), scales as
$R_g \sim 5.5 N^{\nu}$ Å, where the Flory exponent $\nu$ varies from
0.33 to 0.40 [4, 5, 6]. Although this result is expected from
the perspective of polymer physics it is surprising from the 
viewpoint of structural biology because it might be argued 
that sequence and the complexity of secondary and tertiary 
structure organization could lead to substantial deviations 
from the predictions based on Flory-like theory. Here, we 
show that folding rates, $k_F$s, of RNA are also primarily
determined by $N$, thus adding to the growing evidence that
it is possible to understand folding of RNA using polymer 
physics principles. 

Theoretical Considerations. Theoretical arguments, with 
genesis in the dynamics of activated transitions in disordered 
systems, suggest that 



$$k_F \approx k_0 \exp(-\alpha N^{\beta}) \qquad (1)$$



where $\beta$ should be 0.5 [7]. The rationale for this finding
hinges on the observation that favorable base-pairing interactions
and the hydrophobic nature of the bases tend to collapse
RNA whereas the charged phosphate residues are better
accommodated by extended structures. Thus, the distribution
of activation free energy, $\Delta G^{\ddagger}_{UF}/k_B T$, between the folded
and unfolded states is a sum of favorable and unfavorable
terms. We expect from the central limit theorem that the
distribution of $\Delta G^{\ddagger}_{UF}/k_B T$ should be roughly Gaussian with
dispersion $\langle (\Delta G^{\ddagger}_{UF})^2 \rangle \sim N$. Thus, $\Delta G^{\ddagger}_{UF}/k_B T \sim N^{\beta}$ with
$\beta = 1/2$.
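
The central-limit step can be spelled out in a short sketch. It assumes, purely for illustration, that the barrier is a sum of $N$ independent per-nucleotide contributions $\epsilon_i$ (favorable or unfavorable) with a common variance $\sigma^2$ and with means that largely cancel; the symbols $\epsilon_i$ and $\sigma$ are notation introduced here and do not appear in the original argument.

```latex
\frac{\Delta G^{\ddagger}_{UF}}{k_B T} = \sum_{i=1}^{N} \epsilon_i ,
\qquad
\Big\langle \Big( \sum_{i=1}^{N} \epsilon_i \Big)^{2} \Big\rangle
  - \Big\langle \sum_{i=1}^{N} \epsilon_i \Big\rangle^{2} = N \sigma^{2}
\;\;\Longrightarrow\;\;
\frac{\Delta G^{\ddagger}_{UF}}{k_B T} \sim \sigma \sqrt{N},
\qquad
k_F \sim k_0 \exp\!\big(-\alpha N^{1/2}\big).
```

Under these assumptions the typical barrier grows as the square root of the chain length, which is where the exponent 1/2 comes from.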



We analyzed the available experimental data (see Table 1
for a list of RNA molecules) on RNA folding rates by
assuming that $\Delta G^{\ddagger}_{UF}$ grows as $N^{\beta}$ with $\beta$ as a free
parameter. The theoretical value for $\beta$ is 0.5. The folding rates of
RNA, spanning over seven orders of magnitude, are well fit
using $\log k_F = \log k_0 - \alpha N^{\beta}$ with a correlation coefficient of
0.98 (Fig. 1). The fit yields $k_0^{-1} = 0.87\ \mu$s, $\alpha = 0.91$ and
$\beta \approx 0.46$. In the inset we show the fit obtained by fixing
$\beta = 0.5$. Apart from the moderate differences in the $k_0^{-1}$
values the theoretical prediction and the numerical fits are in
agreement, which demonstrates that the major factor
determining RNA folding rates is $N$.
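
As a rough illustration of the fixed-exponent fit, the sketch below performs an unweighted linear least-squares fit of $\ln k_F$ against $N^{\beta}$ with $\beta = 0.5$. The function names `fit_rate_law` and `predict_rate` are hypothetical, the natural logarithm is assumed, and only a handful of (N, rate) pairs from Table 1 are used, so the numbers it prints should not be expected to reproduce the fitted values quoted above.

```python
import math


def fit_rate_law(lengths, rates, beta=0.5):
    """Fit ln k_F = ln k_0 - alpha * N**beta by ordinary least squares.

    Returns (k0, alpha). beta is held fixed; this is an illustrative
    sketch, not the fitting procedure used for Fig. 1.
    """
    x = [n ** beta for n in lengths]
    y = [math.log(k) for k in rates]
    m = len(x)
    x_mean = sum(x) / m
    y_mean = sum(y) / m
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


def predict_rate(n, k0, alpha, beta=0.5):
    """Evaluate k_F = k0 * exp(-alpha * N**beta) for chain length n."""
    return k0 * math.exp(-alpha * n ** beta)


# A subset of (N, k_F in 1/s) pairs taken from Table 1.
data = [(8, 6.7e4), (12, 6.1e4), (21, 1.0e5),
        (76, 5.3e2), (160, 2.0), (414, 0.013)]
lengths = [n for n, _ in data]
rates = [k for _, k in data]

k0, alpha = fit_rate_law(lengths, rates)
print(f"k0 = {k0:.3g} 1/s, 1/k0 = {1e6 / k0:.3g} microseconds, alpha = {alpha:.3g}")
print(f"predicted k_F at N = 100: {predict_rate(100, k0, alpha):.3g} 1/s")
```

Fitting in log space keeps the slowest ribozymes and the fastest hairpins on an equal footing, which matters when the rates span many orders of magnitude.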

It is known that RNA, such as Tetrahymena ribozyme,
folds by multiple pathways, a behavior succinctly described by
the kinetic partitioning mechanism (KPM) [8]. According to
KPM a fraction, $\Phi$, of molecules reaches the native state
rapidly whereas the remaining fraction is trapped in an
ensemble of misfolded intermediates. For Tetrahymena
ribozyme $\Phi \sim 0.1$ [9]. The $N$-dependence given by Eq. (1)
holds for the majority of molecules that fold to the native
state from the compact intermediates, which form rapidly
under folding conditions [10].

Conclusions. Implications of our findings are: (i) The
inverse of the prefactor, $k_0^{-1} = \tau_0 \approx 0.87\ \mu$s, is almost six
orders of magnitude larger than the transition state theory
estimate of $h/k_B T \approx 0.16$ ps. The value of $\tau_0$, which coincides
with the typical base pairing time [11], is the speed limit
for RNA folding. Interestingly, general arguments based on
the kinetics of loop formation have been used to predict
that the speed limit for protein folding is also about one
$\mu$s [12, 13, 14]. It remains to be ascertained if the
common folding speed limit for proteins and RNA is due to
evolutionary pressure on the folding of evolved sequences.



FIGURE 1 Dependence of folding rates of RNA on $N$. The
green circles are experimental data and the red line is the fit
using $\log k_F = \log k_0 - \alpha N^{\beta}$ with $\beta$ as an adjustable
parameter. The inset shows the fit obtained by fixing $\beta$ to the predicted
theoretical value of 0.5.



(ii) It is worth pointing out that Dill [15] has recently shown
that the rates and stabilities of protein folding
depend only on the number of amino acids, which in turn places
strict constraints on their functions in the cellular context.
Taken together these studies show that despite the
complexity of protein and RNA folding only a few variables might
determine their global properties, which suggests that there
may be simple principles that determine biological functions.
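
The comparison in item (i) between the fitted prefactor time and the transition state theory estimate can be checked with a few lines of arithmetic. The sketch below assumes a temperature of 298 K, which is not stated in the text, and the name `tst_time` is a hypothetical helper introduced only for this example.

```python
import math

PLANCK_H = 6.62607015e-34     # Planck constant in J s (exact SI value)
BOLTZMANN_KB = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)


def tst_time(temperature_k):
    """Return the transition state theory time scale h / (k_B T) in seconds."""
    return PLANCK_H / (BOLTZMANN_KB * temperature_k)


TEMPERATURE_K = 298.0   # assumed room temperature, not given in the text
TAU_0 = 0.87e-6         # inverse prefactor from the fit, in seconds

tau_tst = tst_time(TEMPERATURE_K)
ratio = TAU_0 / tau_tst

print(f"h/(k_B T) = {tau_tst * 1e12:.2f} ps")
print(f"tau_0 / (h/(k_B T)) = {ratio:.2e}")
print(f"orders of magnitude: {math.log10(ratio):.1f}")
```

At this assumed temperature the transition state theory time comes out near the 0.16 ps quoted above, and the ratio shows how much slower the fitted speed limit is.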

Table 1 RNA length versus folding rate.

| RNA | N | k_F (s^-1) | Type |
|---|---|---|---|
| GCUUCGGC [16] | 8 | 6.7 × 10^4 | tetraloop hairpin |
| GCUUCGGC [16] | 8 | 27.2 × 10^4 | tetraloop hairpin |
| GGUUCGCC [16] | 8 | 1.3 × 10^4 | tetraloop hairpin |
| GGUUCGCC [16] | 8 | 4.7 × 10^4 | tetraloop hairpin |
| GGACUUUUGUCC [16] | 12 | 6.1 × 10^4 | tetraloop hairpin |
| GGACUUCGGUCC [16] | 12 | 4.5 × 10^4 | tetraloop hairpin |
| A6C6U6 [17] | 18 | 3.4 × 10^4 | tetraloop hairpin |
| Extra-arm of tRNA^Ser (yeast) [18] | 21 | 1 × 10^5 | tRNA |
| pG-half of tRNA^Phe (yeast) [18] | 36 | 9 × 10^3 | tRNA |
| CCA-half of tRNA^Phe (yeast) [18] | 39 | 8.5 × 10^3 | tRNA |
| CCA-half of tRNA^Phe (wheat) [18] | 39 | 8 × 10^3 | tRNA |
| tRNA^Phe (yeast) [19] | 76 | 5.3 × 10^2 | tRNA |
| tRNA^Ala (yeast) [18] | 77 | 9 × 10^2 | tRNA |
| Y4-hairpin [20] | 14 | 5.75 × 10^4 | hairpin (5×2+4) |
| Y9-hairpin [20] | 19 | 2.29 × 10^4 | hairpin (5×2+9) |
| Y19-hairpin [20] | 29 | 8.70 × 10^2 | hairpin (5×2+19) |
| Y34-hairpin [20] | 44 | 6.03 × 10^2 | hairpin (5×2+34) |
| VPK pseudoknot [21] | 34 | 9.09 × 10^2 | pseudoknot |
| Hairpin ribozyme (4-way junction) [22, 23] | 125 | 6 | natural form of hairpin ribozyme |
| P5abc [24] | 72 | 50 | Group I intron T. ribozyme |
| P4-P6 domain (Tetrahymena ribozyme) [24] | 160 | 2 | Group I intron T. ribozyme |
| Azoarcus [25, 23] | 205 | 7 ~ 14 | |
| B. subtilis RNase P RNA catalytic domain [26] | 225 | 6.5 ± 0.2 | |
| Ca.L-11 ribozyme [27] | 368 | 0.03 | |
| E. coli RNase P RNA [28] | 377 | 0.011 ± 0.001 | |
| B. subtilis RNase P RNA [28] | 409 | 0.008 ± 0.002 | |
| Tetrahymena ribozyme [29, 23] | 414 | 0.013 | Group I intron T. ribozyme |